Viewpoint

Institutional Review Boards and Your Research


A proposal for improving the review procedures for research projects that involve human subjects and their associated identifiable private information.


Researchers in computer science departments throughout the U.S. are violating federal law and their own organization’s regulations regarding human subjects research, and in most cases they don’t even know it. The violations are generally minor, but the lack of review leaves many universities open to significant sanctions, up to and including the loss of all federal research dollars. The lack of review also means that potentially hazardous research has been performed without adequate review by those trained in human subject protection.

We argue that much computer science research performed with the Internet today involves human subject data and, as such, must be reviewed by Institutional Review Boards. This includes nearly all research projects involving network monitoring, email, Facebook, other social networking sites, and many Web sites with user-generated content. Failure to address this issue now may cause significant problems for computer science in the near future.

Prisons and Syphilis

At issue are the National Research Act (NRA) of 1974 [a] and the Common Rule, [b] which together articulate U.S. policy on the Protection of Human Subjects. This policy was created following a series of highly publicized ethical lapses on the part of U.S. scientists performing federally funded research. The most objectionable cases involved human medical experimentation, specifically the Tuskegee Syphilis Experiment, a 40-year-long U.S. government project that deliberately withheld syphilis treatment from poor rural black men. Another was the 1971 Stanford Prison Experiment, funded by the U.S. Office of Naval Research, in which students playing the role of prisoners were brutalized by other students playing the roles of guards.

[a] PL 93-348, see http://history.nih.gov/research/downloads/PL93-348.pdf
[b] 45 CFR 46, see http://www.hhs.gov/ohrp/humansubjects/guidance/45cfr46.htm

The NRA requires any institution receiving federal funds for scientific research to set up an Institutional Review Board (IRB) to approve any use of humans before the research takes place. The regulation that governs these boards is the Common Rule (“Common” because the same rule was passed in 1991 by each of the 17 federal agencies that fund most scientific research in the U.S.).

Computer scientists working in the field of Human-Computer Interaction (HCI) have long been familiar with the Common Rule: any research that involves recruiting volunteers, bringing them into a lab, and running them through an experiment obviously involves human subjects. NSF grant applications specifically ask if human subjects will be involved in the research and require that applicants indicate the date IRB approval was obtained.

But a growing amount of research in other areas of computer science also involves human subjects. This research doesn’t involve live human beings in the lab, but instead involves network traffic monitoring, email, online surveys, digital information created by humans, photographs of humans that have been posted on the Internet, and human behavior observed via social networking sites.

The Common Rule creates a four-part test that determines whether a proposed activity must be reviewed by an IRB:

1. The activity must constitute scientific “research,” a term that the Rule broadly defines as “a systematic investigation, including research development, testing and evaluation, designed to develop or contribute to generalizable knowledge.” [c]

2. The research must be federally funded. [d]

3. The research must involve human subjects, defined as “a living individual about whom an investigator (whether professional or student) conducting research obtains (1) data through intervention or interaction with the individual, or (2) identifiable private information.” [e]

4. The research must not be “exempt” under the regulations. [f]
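The four-part test reads as a conjunction of yes/no conditions, which the following Python sketch makes explicit. It is illustrative only: the `Activity` fields and function names are hypothetical, each condition is reduced to a boolean that in practice calls for judgment, and at most institutions the determination belongs to the IRB rather than to the researcher.

```python
from dataclasses import dataclass


@dataclass
class Activity:
    # Hypothetical yes/no answers; a real determination is made by an IRB.
    is_systematic_research: bool          # part 1: "research" as defined by the Rule
    is_federally_funded: bool             # part 2
    interacts_with_living_people: bool    # part 3, clause (1)
    uses_identifiable_private_info: bool  # part 3, clause (2)
    is_exempt: bool                       # part 4


def involves_human_subjects(activity: Activity) -> bool:
    # Either clause of the human-subjects definition is sufficient.
    return (activity.interacts_with_living_people
            or activity.uses_identifiable_private_info)


def requires_irb_review(activity: Activity) -> bool:
    # All four parts of the test must hold for review to be required.
    return (activity.is_systematic_research
            and activity.is_federally_funded
            and involves_human_subjects(activity)
            and not activity.is_exempt)


# A federally funded network-monitoring study: nobody visits a lab, but
# identifiable private information is obtained, so part 3 is satisfied.
packet_capture = Activity(True, True, False, True, False)
print(requires_irb_review(packet_capture))
```

Note that the second clause of part 3 is what brings studies with no lab participants into scope.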

The exemptions are a kind of safety valve to prevent IRB regulations from becoming utterly unworkable. For computer scientists the relevant exemptions are “research to be conducted on educational practices or with educational tests” (§46.101(b)(1&2)); and research involving “existing data, documents, [and] records...” provided that the data set is either “publicly available” or that the subjects “cannot be identified, directly or through identifiers linked to the subjects” (§46.101(b)(4)). Surveys, interviews, and observations of people in public are generally exempt, provided that identifiable information is not collected, and provided that the information collected, if disclosed, could not “place the subjects at risk of criminal or civil liability or be damaging to the subjects’ financial standing, employability, or reputation” (§46.101(b)(2)(i&ii)).

[c] §46.102(d)
[d] §46.103(a)
[e] §46.102(f)
[f] §46.101(b)

IRBs exist to review proposed research and protect the interests of the human subjects. People can participate in dangerous research, but it’s important that people are informed, if possible, of the potential risks and benefits, both to themselves and to society at large.

What this means to computer scientists is that any federally funded research involving data generated by people that is “identifiable” and not public probably requires approval in advance by your organization’s IRB. This includes obvious data sources like network traffic, but it also includes not-so-obvious sources like software that collects usage statistics and “phones home.”


Complicating matters is the fact that the Common Rule allows organizations to add requirements of their own. Indeed, many U.S. universities require IRB approval for any research involving human subjects, regardless of funding source. Most universities also prohibit researchers from determining if their own research is exempt. Instead, U.S. universities typically require that all research involving human beings be submitted to the school’s IRB.

This means a broad swath of “exempt” research involving publicly available information nevertheless requires IRB approval. Performing social network analysis of Wikipedia pages may fall under IRB purview: Wikipedia tracks which users edited which pages, and when those edits were made. Using Flickr pages as a source of JPEGs for analysis may require IRB approval, because Flickr pages frequently have photos of people (identifiable information), and because the EXIF “tags” that many cameras store in JPEG images may contain serial numbers that can be personally identifiable. Analysis of Facebook poses additional problems and may not even qualify as exempt: not only is the information personally identifiable, but it is frequently not public. Instead, Facebook information is typically only available to those who sign up for the service and get invited into the specific user’s network.

We have spoken with quite a few researchers who believe the IRB regulations do not apply to them because they are working with “anonymized” data. Ironically, the reverse is probably true: IRB approval is required to be sure the data collection is ethical, that the data is adequately protected prior to anonymization, and that the anonymization is sufficient. Most schools do not allow the experimenters to answer these questions for themselves, because doing so creates an inherent conflict of interest. Many of these researchers were in violation of their school’s regulations; some were in violation of federal regulations.

How to Stop Worrying and Love the IRB

Many IRBs are not well equipped to handle the fast-paced and highly technical nature of computer-related research. Basic questions arise, such as: Are Internet Protocol addresses personally identifiable information? What is “public” and what is not? Is encrypted data secure? Can anonymized data be re-identified? Researchers we have spoken with are occasionally rebuffed by their IRBs, which insist that no humans are involved in the research, ignoring that the regulations also apply to “identifiable private information.”
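The question of whether anonymized data can be re-identified can be made concrete with a small example. The following Python sketch is purely illustrative and does not describe any study mentioned here. It assumes a hypothetical anonymization scheme in which IPv4 addresses are replaced by unsalted SHA-256 digests, and it assumes an observer who can guess the network the addresses came from. The function names and the documentation-range network `192.0.2.0/24` are chosen only for the example.

```python
import hashlib
import ipaddress
from typing import Optional


def naive_pseudonym(ip: str) -> str:
    # Hypothetical "anonymization": an unsalted hash of the address string.
    return hashlib.sha256(ip.encode("ascii")).hexdigest()


def reidentify(pseudonym: str, network: str) -> Optional[str]:
    # Enumerate every host address in the candidate network and compare
    # digests; the search space is small enough to exhaust.
    for host in ipaddress.ip_network(network).hosts():
        if naive_pseudonym(str(host)) == pseudonym:
            return str(host)
    return None


released = naive_pseudonym("192.0.2.77")       # what a data set might contain
print(reidentify(released, "192.0.2.0/24"))    # recovers the original address
```

Under these assumptions the mapping can be reversed because the set of possible inputs is small and the hash uses no secret key. This is the kind of technical detail an IRB may need help evaluating.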

Another mismatch between computer science research and IRBs is timescale. CS research progresses at a much faster pace than research in the biomedical and behavioral fields. In one case we are aware of, an IRB took more than a year to make a decision about a CS application. But even two or three months to make a decision, typical of many IRBs, is too slow for a student in a computer science course who wants to perform a social network analysis as a final project.

For example, one of our studies, which involved observing how members of our university community responded to simulated phishing attacks over a period of several weeks, had to be shortened after being delayed two months by an understaffed IRB. With the delayed start date, part of the study would have taken place over winter break, when few people are on campus. Another study we worked on was delayed three months after an IRB asked university lawyers to review a protocol to determine whether it would violate state wiretap laws.

In another case, researchers at Indiana University worked with their IRB and the school’s network security group to send out phishing attacks based on data gleaned from Facebook. [g] Because of the delays associated with the approval process, the phishing messages were sent out at the end of the semester, just before exams, rather than at the beginning of the semester. Many recipients of the email complained vociferously about the timing.

Another reason computer scientists have problems with IRBs is the level of detail the typical IRB application requires. Computer scientists, for the most part, are not trained to carefully plan out an experiment in advance, to figure out which data will be collected, and then to collect the results in a manner that protects the privacy of the data subjects. (Arguably, computer scientists would benefit from better training on experimental design, but that is a different issue.) We have observed that many IRB applications are delayed because of a failure on the part of CS researchers to make these points clear.

[g] T. Jagatic, N. Johnson, M. Jakobsson, and F. Menczer. Social phishing. Commun. ACM 50, 10 (Oct. 2007), 94-100.

Finally, many computer scientists are unfamiliar with the IRB process and how it applies to them, and may be reluctant to engage with their IRB after having heard nothing but complaints from colleagues who have had their studies delayed by a slow IRB approval process. While the studies that CS researchers perform are often exempt or extremely low risk, it is becoming increasingly easy to collect human subjects data over the Internet that needs to be properly protected to avoid harming subjects. Likewise, the growing amount of research involving honeypots, botnets, and the behavior of anonymity systems would seem to require IRBs, since the research involves not just software, but humans, both criminals and victims.

The risks to human subjects from computer science research are not always obvious, and the IRB can play an important role in helping computer scientists identify these risks and ensure that human subjects are adequately protected. Is there a risk that data collected on computer security incidents could be used by employers to identify underperforming computer security administrators? Is there a risk that anonymized search engine data could be re-identified to reveal what particular individuals are searching for? Can network traffic data collected for research purposes be used to identify copyright violators? Can posts to Livejournal and Facebook be correlated to learn the identities of children who are frequently left home alone by their parents?

In order to facilitate more rapid IRB review, we recommend the development of a new, streamlined IRB application process. Experimenters would visit a Web site that would serve as a self-serve “IRB kiosk.” This site would ask experimenters a series of questions to determine whether their research qualifies as exempt. These questions would also serve to guide experimenters in thinking through whether their research plan adequately protects human subjects. Qualifying experimenters would receive preliminary approval from the kiosk and would be permitted to begin their experiments. IRB representatives would periodically review these self-serve applications and grant final approval if everything was in order.

Such a kiosk is actually permissible under current regulations, provided that the research is exempt. A kiosk could even be used for research that is “expedited” under the Common Rule, since expedited research can be approved by the IRB Chair or by one or more “experienced reviewers.” [h] In the case of non-exempt expedited research, the results of the kiosk would be reviewed by such a reviewer prior to permission being given to the researcher.
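The routing such a kiosk would perform can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a specification. The question keys (`existing_data_only`, `publicly_available`, `subjects_identifiable`, `expedited_eligible`) are hypothetical, the exemption check covers only the existing-data exemption as summarized above, and `expedited_eligible` stands in for whatever criteria an institution applies to expedited review.

```python
from enum import Enum
from typing import Mapping


class Outcome(Enum):
    PRELIMINARY_APPROVAL = "may begin; IRB staff confirm later"
    AWAIT_REVIEWER = "hold until an experienced reviewer signs off"
    FULL_REVIEW = "route to the full IRB"


def looks_exempt(answers: Mapping[str, bool]) -> bool:
    # Existing-data exemption as summarized in the article: the data set is
    # publicly available, or the subjects cannot be identified.
    return answers["existing_data_only"] and (
        answers["publicly_available"] or not answers["subjects_identifiable"]
    )


def kiosk_decision(answers: Mapping[str, bool]) -> Outcome:
    if looks_exempt(answers):
        # Exempt research: start now, with periodic IRB review afterward.
        return Outcome.PRELIMINARY_APPROVAL
    if answers["expedited_eligible"]:
        # Non-exempt expedited research: a reviewer must approve first.
        return Outcome.AWAIT_REVIEWER
    return Outcome.FULL_REVIEW


public_archive_study = {
    "existing_data_only": True,
    "publicly_available": True,
    "subjects_identifiable": True,
    "expedited_eligible": False,
}
print(kiosk_decision(public_archive_study).name)
```

The same questions that drive the routing would double as a checklist, prompting experimenters to consider identifiability and data sources before they begin.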

Institutional Review Board chairs from many institutions have told us informally that they are looking to computer scientists to come up with a workable solution to the difficulty of applying the Common Rule to computer science. It is also quite clear that if we do not come up with a solution, they will be forced to do so.


[h] §46.110(b)